@pvary pvary commented Nov 19, 2020

No description provided.

@github-actions github-actions bot added the build label Nov 19, 2020
rdblue commented Nov 19, 2020

Looks like the log is too long with this enabled:

This file has been truncated. View the full logs in the ... menu once the check is completed.

Maybe we can output logs to files for each module and fetch them when we need them? We can look into getting the files produced.

@github-actions github-actions bot added the INFRA label Nov 19, 2020
@github-actions github-actions bot added the MR label Nov 19, 2020
pvary commented Nov 20, 2020

Looks like the log is too long with this enabled:

This file has been truncated. View the full logs in the ... menu once the check is completed.

It is still possible to see the logs by downloading them from the ... menu in the top right corner of the Details screen.
But I absolutely agree that this is not ideal, since we might need to download the full logs to find the failures. That is why I followed up on your suggestion below.

Maybe we can output logs to files for each module and fetch them when we need them? We can look into getting the files produced.

I was able to come up with a solution where I write the test output to subproject-specific files and add them as artifacts for the build in case of a failure. The archive is accessible under the Artifacts button. See: https://github.com/apache/iceberg/runs/1431325389 (I intentionally created some test failures)

Sample output (build/testlogs/iceberg-parquet.log):

--------
- Test log for: Test testRowGroupSizeConfigurableWithWriter(org.apache.iceberg.parquet.TestParquet)
--------
StdErr log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.NativeCodeLoader).
StdErr log4j:WARN Please initialize the log4j system properly.
StdErr log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
--------
- Test log for: Test testListProjection(org.apache.iceberg.avro.TestParquetReadProjection)
--------
StdErr [Test worker] INFO org.apache.parquet.hadoop.InternalParquetRecordReader - RecordReader initialized will read a total of 1 records.
StdErr [Test worker] INFO org.apache.parquet.hadoop.InternalParquetRecordReader - at row 0. reading next block
StdErr [Test worker] INFO org.apache.parquet.hadoop.InternalParquetRecordReader - block read in memory in 1 ms. row count = 1

Since I think this could be useful every time we run the tests, I have configured this logging not only for CI runs but for general test runs as well - some might disagree, so please check this.
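For readers who want to see how such per-subproject log files can be produced, here is a minimal sketch using Gradle's built-in `Test` task hooks (`beforeTest`, `onOutput`). The file layout mirrors the `build/testlogs/<project>.log` naming above, but the code itself is illustrative, not the PR's exact implementation:

```groovy
// Illustrative sketch only: write each subproject's test output to
// build/testlogs/<project>.log using Gradle's Test task listeners.
subprojects {
  test {
    // One log file per subproject, matching the layout described above.
    def logFile = rootProject.file("build/testlogs/${project.name}.log")
    doFirst {
      logFile.parentFile.mkdirs()
      logFile.text = ''
    }
    // Write a header before every test, like the "Test log for:" blocks above.
    beforeTest { descriptor ->
      logFile << "--------\n- Test log for: ${descriptor}\n--------\n"
    }
    // Append captured StdOut/StdErr lines as they are produced.
    onOutput { descriptor, event ->
      logFile << "${event.destination} ${event.message}"
    }
  }
}
```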

Thanks,
Peter

with:
  name: test logs
  path: |
    **/build/testlogs
I think we just need to archive **/build/reports. The reports have the full stack traces and usually include stderr/stdout for each test. That may be easier than creating a separate file, since I think we already aggregate the logs there.
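In the workflow, this suggested alternative could look roughly like the following; a sketch assuming the `actions/upload-artifact` action, with the step name and version chosen for illustration:

```yaml
# Illustrative sketch: archive Gradle's test reports on failure instead of
# custom per-subproject log files.
- name: Upload test reports
  uses: actions/upload-artifact@v2
  if: failure()
  with:
    name: test reports
    path: |
      **/build/reports
```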


rdblue commented Nov 20, 2020

@pvary, looks great! Thanks for looking into the archive options. I think we may be able to do this more simply by archiving reports instead of creating new log files. The reports I've used have had stderr/stdout.

pvary commented Nov 20, 2020

@pvary, looks great! Thanks for looking into the archive options. I think we may be able to do this more simply by archiving reports instead of creating new log files. The reports I've used have had stderr/stdout.

Thanks for reviewing @rdblue!
I have tried archiving reports as a first option. You can see the results here in the artifact: https://github.com/apache/iceberg/runs/1429922302

I found them lacking for two reasons:

  1. Hive often does not propagate the exception to the client (the theory is that infra errors should not be handled by the users), so Hive-based tests produce exceptions like the following, without enough detail to investigate (the INFO logs contain the real exception):
org.apache.iceberg.mr.hive.TestHiveIcebergStorageHandlerWithCustomCatalog > testScanTable[fileFormat=PARQUET, engine=tez] FAILED
    java.lang.IllegalArgumentException: Failed to execute Hive query 'SELECT * FROM default.customers ORDER BY customer_id DESC': Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
        Caused by:
        org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
  2. Having a separate directory of test results for every subproject does not help in identifying the failing test at a glance.

As a second option I tried the official solution for aggregating the results, but it only does the aggregation if there is no test failure, which is less than ideal 😄

That is why I decided to backtrack to the "have a file per subproject and keep the original test output" solution.

After my trip into test output aggregation, I feel that if we want aggregated test results, that should be done in a different PR; here we should concentrate on archiving the logs.


rdblue commented Nov 20, 2020

I thought that logs printed to stderr and stdout were available in the test reports. Maybe I'm wrong though.

For the second problem, I thought that you'd usually know which test had failed and go looking for its results.

@rdblue rdblue merged commit 275ec9a into apache:master Nov 20, 2020

rdblue commented Nov 20, 2020

I'll merge this so that we have something working to debug these cases. We can always improve on it later.
